SenTube: A Corpus for Sentiment Analysis on YouTube Social Media

Authors

  • Olga Uryupina
  • Barbara Plank
  • Aliaksei Severyn
  • Agata Rotondi
  • Alessandro Moschitti
Abstract

In this paper we present SenTube, a dataset of user-generated comments on YouTube videos annotated for information content and sentiment polarity. Its annotations support the development of classifiers for several important NLP tasks: (i) sentiment analysis, (ii) text categorization (relatedness of a comment to the video and/or product), (iii) spam detection, and (iv) prediction of comment informativeness. The SenTube corpus supports research on indexing and searching YouTube videos using information derived from comments. The corpus will cover several languages: at the moment, we focus on English and Italian, with Spanish and Dutch parts scheduled for later stages of the project. For all languages, we collect videos for the same set of products, thus offering possibilities for multi- and cross-lingual experiments. The paper provides annotation guidelines, corpus statistics, and annotator agreement details.
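
As a rough illustration of how the four annotation layers named in the abstract could sit on a single comment, here is a minimal Python sketch. The `AnnotatedComment` class, its field names, the `Polarity` label set, the `training_pairs` helper, and the sample comment are all hypothetical, chosen only for illustration. They are not the SenTube release format, which is defined by the annotation guidelines in the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Polarity(Enum):
    # Hypothetical label set; the actual SenTube labels are defined in the paper.
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class AnnotatedComment:
    """One comment carrying one (assumed) field per task listed in the abstract."""
    text: str
    language: str         # e.g. "en" or "it"
    polarity: Polarity    # (i) sentiment analysis
    about_video: bool     # (ii) relatedness to the video
    about_product: bool   # (ii) relatedness to the product
    is_spam: bool         # (iii) spam detection
    is_informative: bool  # (iv) comment informativeness


def training_pairs(comments, task):
    """Return (text, label) pairs for one task, read from the field named `task`."""
    return [(c.text, getattr(c, task)) for c in comments]


# Invented sample comment, not taken from the corpus.
example = AnnotatedComment(
    text="The camera on this phone is great.",
    language="en",
    polarity=Polarity.POSITIVE,
    about_video=False,
    about_product=True,
    is_spam=False,
    is_informative=True,
)
```

Under this assumed layout, the same set of comments can feed a separate classifier per task, for example `training_pairs([example], "is_spam")` for spam detection.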

Similar resources

Sentiment Analysis on YouTube: A Brief Survey

Sentiment analysis, or opinion mining, is the field of study that analyzes the opinions, sentiments, evaluations, attitudes, and emotions that users express on social media and other online resources. The rise of social media sites has also drawn users to video sharing sites such as YouTube. Online users express their opinions or sentiments on the videos that the...

User sentiment detection: a YouTube use case

In this paper we propose an unsupervised lexicon-based approach to detect the sentiment polarity of user comments in YouTube. Polarity detection in social media content is challenging not only because of the existing limitations in current sentiment dictionaries but also due to the informal linguistic styles used by users. Present dictionaries fail to capture the sentiments of community-created...

Content Strategy and Fan Engagement in Social Media: The Case of PyeongChang Winter Olympic And Paralympic Games

Background. This paper investigates the pillars of content strategy and fan engagement in social networks during the 2018 PyeongChang Winter Olympics and Paralympics. Objectives. The purpose of this paper is to identify the reasons behind the differences in fan engagement across the social media channels of the PyeongChang Winter Olympics and Paralympics. Methods. Facebook and YouTube channels are used to analyze en...

Enhancing Web intelligence with the content of online video fragments

This demo shows work to enhance a Web intelligence platform that crawls and analyses online news and social media content about climate change topics to uncover sentiment and opinions around those topics over time. The enhancement lets the platform also incorporate the content of non-textual media, in our case YouTube videos. YouTube contains a lot of organisational and individual opinion about climate change which c...

Sentiment analysis methods in Persian text: A survey

With the explosive growth of social media such as Twitter, reviews on e-commerce websites, and comments on news websites, individuals and organizations are increasingly using opinions in these media for their decision making. Sentiment analysis is one of the techniques used in recent years to analyze users' opinions. The Persian language has specific features and thereby requires unique meth...

Publication date: 2014